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VIDEO PLAYER AND AUTHORING TOOL 
FOR PRESENTATIONS WITH TANGENTIAL CONTENT 

BACKGROUND OF THE INVENTION 

5 This invention relates generally to multimedia presentations. Particularly, this 

invention relates to a method of generating and playing a multimedia content 
presentation in which a video is shown and tangential content is presented at preset 
points in the video. 

b In the prior art, video players have allowed computer users to play standalone 

|J) video files or streaming files from the Internet. For example, the Microsoft brand 
Q Windows operating system comes installed with the "Windows Media Player" 

iu 

5 application. As another example, Real.com provides computer users with their 
m "RealPlayer," another popular video viewing application. These players are similar to 
the common VCR in that the user accesses controls to move forward and backwards 
jj through the video, pausing at any point. 

W 

O In recent times, the corporate world has discovered the advantages of using 

webcasting and videos for training and consumer information purposes. In August 
2001 , the Wall Street Journal ran a story by Riva Richmond about the emerging industry 
of webcasting. According to experts at research firm Jupiter Media Metrix Inc., 

20 spending on webcasts for product launches may reach $567 million by 2005. Similar 
technology for employee training may become a $519 million market by 2005. 

The use of webcasting and streaming videos, and the like, as part of e-learning 
tools are cost effective. For example, one luxury car maker has created an e-learning 
training program for its new mechanics. The new program eliminates costs for travel to 
25 corporate training centers and decreases the mechanics' time away from work. From 
these cost saving advantages, the e-learning video program recovered its production 
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costs after just 50 students completed the course. A primary reason for the growth of 
such video technology is the cost savings for corporate America. 

Unfortunately, current systems that allow users to play videos (either of the 
streaming or traditional variety) or to replay recorded webcasts from a computer have 
disadvantages. First, it is difficult for the user to easily find and review portions of the 
videos. As with traditional taping through a VCR, a user must fiddle around with the 
"rewind" and "fast forward* controls to reach the desired location of the video. 
Secondly, current computer video players do not incorporate other forms of materials 
into presentations. For example, if the video mentions a website that the user should 
visit, the user must stop the video and use a separate Internet browser to manually view 
the website. Likewise, if the video includes references to other, tangential, content that 
the user may wish to access - such as other related videos, documents, or music - the 
user must find that content on his or her own at a later time. 

What is needed in the art is a way to incorporate such related information into a 
video so that the user can freely access it. Such a device should allow the user to 
easily jump back and forth from the primary video to the tangential content. Such a 
device should also allow the user to skip any or all of the tangential that is available. 

SUMMARY OF THE INVENTION 

One object of the present invention is to provide a method for viewing a 
presentation that includes tangential content. The tangential content may include other 
videos, music, documents, websites, etc. 

Another object of the invention is to associate the tangential content with the 
appropriate portion of the primary video. Thus, the user is given access to the 
tangential content only when it becomes relevant to the presentation. 

Yet another object of one embodiment of the invention is to provide a 
presentation which is divided into a series of steps. Information for each step is readily 
available to the user without requiring that user to view the entire presentation. 
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In one embodiment, the present invention is a player for a multimedia 
presentation, where the presentation includes a video as well as tangential content, 
such as supporting documents, webpages, and the like. Each segment of tangential 
content is assigned to a representative slide that will become visible (appear) at a 
5 predetermined time or frame in the video. These slides are displayed at set locations 
near the display region of the video. To access any of the tangential content the user 
selects one of the slides and the tangential content is displayed. Control panels may 
allow the user to control the video and/or tangential content The display regions for the 
video as well as the tangential content could also be combined. A presentation script is 
10 used to define the interrelationship of the slides, the tangential content, and the video so 
that presentations having tangential content can be readily created. 

Another embodiment of the present invention allows a user to create such a 
presentation using a video, and a set of tangential content. Such an authoring tool 
q allows the user to associate slides with certain times or frames of the video as well as to 
B associate what tangential content is retrieved for each slide. The output of such an 
§•■* authoring tool is a presentation script. 

[Jj Other objects and advantages of the present invention will become more 

Q apparent to those persons having ordinary skill in the art to which the present invention 

pertains from the foregoing description taken in conjunction with the accompanying 
20 drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a diagram of one embodiment of the present invention describing the 
components of a software system for authoring a multimedia presentation. 

Fig. 2 is a diagram of one embodiment of a video presentation as displayed to 
25 the user. 

Fig. 3 is a diagram of one embodiment of the present invention showing the 
components of a content template for a video presentation. 
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Figs. 4 and 5 are flow charts of two embodiments of a method used by a video 
authoring tool used to prepare a presentation . 

Fig. 6 is a diagram of one embodiment of the present invention, showing the 
components of a content template for a video presentation. 

5 Fig. 7 is a block diagram illustrating the video player architecture. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In one embodiment of the invention, the system shown in Figure 1 can be used 
by a user to develop a multimedia presentation 13 on a given subject. The method of 
k . the present invention uses the computer system to collect pre-existing content, such as 
fb audio content 4, video content 3, graphics/pictures 5, text 1, interactive computer 
programs (such as applets) 2, and other types of multimedia content (such as HTML 

pi 

£ content). 

W The various pieces of content (1-5) are collected for input 6 into the video 

authoring tool 6, which also has access to predefined general formats 7 of video 
f§ presentations. Each general format specification, such as the exemplary Part 1 
hi specification (8) and Part N specification (10) has its own content requirements, known 
as the part's content form 17. The content form 17 includes a shell 51, and a kernel 46 . 

Content Forms 

The content form(s) 17 of the present invention define the format of how the 
20 multimedia presentation will be presented to the user. In a given presentation, which 
may contain videos, text, audio clips, etc., there may be the need for several content 
forms - one for each type of content to be presented. The video authoring tool 6 
creates a presentation interface integrating the multimedia content by using as input 
both the content forms 17 and the multimedia content 1-5. 
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Figure 3 shows the content form 17 used in displaying a presentation to the user 
in more detail. As shown in Figure 3, the content form 17 contains a content shell 51 
and a content kernel 46. The content shell 51 is a user interface template for structuring 
various multimedia content. The content kernel 46 is one or more data files that 
5 contains all the necessary multimedia content, in the appropriate formats, for the 
content shell 51 to use. 

The content shell 51 , of the content form 17 in Fig. 3 defines a video playing in 
a main window 47. Commands 45 control the video, accompanying text 44 or other 
multimedia content and predetermined images in shortcut slides 41-43. The tangential 
10 content 44 may be information related to information in the main window 47. As the 
q video is playing, a user may access the tangential content 44. The predetermined 
images in the shortcut boxes 41-43 are selectable by a user or otherwise "activated 
O automatically" (if so scripted) and may initiate an event. For example, selecting an 
P image may cause a "jump" to a particular scene in the video or may activate other 
p multimedia content in the main window. Alternatively, if so scripted, an event not 
apparent to the user may take place, such as the billing of a credit card or the 

L..L 

[T compilation and/or submission of user profile information. The content shell 51 also 
W includes an audio source 50. The audio source is an interface to a sound source, such 
hi as a speaker. 

20 Fig. 6 is an another example content form 17 having a content shell 51 and 

content kernel 46. This content form 17 again defines a video playing in a main window 
150 including video control commands 152. As a video plays in the main window 150, 
predetermined events start occurring in shortcut slides (also known as boxes) 151 at 
predetermined times. As an example, the predetermined event may be the appearance 

25 of a predetermined image. Once an image appears in a shortcut box 151, the image is 
selectable by a mouse click or other input method, or alternatively, the slide can auto- 
select or activate itself if so programmed by a script. When a shortcut slide/box 151 is 
selected or is automatically activated, the video or other multimedia content executing in 
the main window 150 pauses and a second, tangential presentation begins. The 
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second presentation can begin in the main window 150 or anywhere else in the content 
shell 51 . The second presentation relates to the concept depicted by the selected event 
in the shortcut box 151. The second presentation can be of variable format, such as 
text, video, graphic image, interactive program, web browser, etc. In one exemplary 
5 embodiment, the second presentation becomes visible in the main window 150 and 
another control panel appears in the control command area 152 giving the user 
navigational control over the second presentation. If the second presentation is text, the 
user may be able to use scrolling, paging and other text control buttons. If the second 
presentation is a video the user may be given another set of video control buttons. 

10 Filling in the Content Shell: Video Authoring Tool 

BPS is 

■?~>\ 

p The content forms 17 represented by Fig. 3 and Fig. 6 are just two exemplary 

J:: ways of structuring the multimedia content for presentation to a viewer. The content 

shell 51 defines a main window 47, and n-number of shortcut boxes or slides 41 , 42, 43, 
i J] which ^ump" to particular playback points in the video 49 stored in the content kernel 
15 46. Fig. 3 shows, by way of example, three shortcut slides 41-43. It is important to 

note that the video playback during content editing is different from that of the video 
r f playback in the content shell as seen by a viewer during the presentation. It is to be 

understood that there may be any number of shortcut slides, and they may be 

structured in various graphical ways in the content shell 51 . 

20 Fig. 4 illustrates an exemplary method for creating a presentation script which will 

direct a multimedia presentation for the content shell in Fig. 3 where the shortcut slides 
in the content shell 51 link predetermined multimedia images or text to playback points 
of the video. In the present embodiment, the author of a new presentation first inputs a 
pre-existing video 60 into the video authoring tool . The video begins to play and the 

25 author may, at any time, use video controls 73 to control the video, such as with 

controls to fast forward, reverse, pause, stop, play, or play in slow motion. In Fig. 4, the 
controls are graphically shown with their common symbols. 
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At any desired point in the video, the author may choose and extract a playback 
point P0 (or frame) from the video 64. The playback of the video during content 
authoring is then paused 65 and a shortcut slide in the content shell 51 is associated 
with the playback point P0. A still image of the video at the playback point is captured 
5 66 and the shortcut slide in the content shell 51 is filled with the captured image 67. 
The author may also associate tangential content (such as text or a clipped video 
segment) with the added shortcut slide. A specific event is then chosen 68 for 
activation of the shortcut slide. For example, a shortcut slide may be activated during 
execution if a user clicks on it with a mouse or uses some other input method, or 
10 alternatively it can be activated automatically if so programmed by the presentation 
u script. In the exemplary embodiment illustrated in Fig. 4, the event path for activation 
of the shortcut slide is linked to playing the video in the main window at the playback 
point P0. If the author is finished with adding shortcut slide , the video editing ends 70 
K and a presentation script is generated which can later be used by a video player to play 

0 the presentation. Otherwise, the playback resumes 71 and 72. 

hi 

1 Once the video authoring tool of Figure 4 constructs a multimedia presentation, a 
f user may later play it To do so, the user accesses the content shell 51 of Figure 3 and 

hi uses the controls 45 to cause the video 49 to begin. In some embodiments, as the user 

rn 

r; progresses through the video 49 to the various playback points pre-associated with 
20 shortcut slides (41 , 42, etc.), then the shortcut slide appears in the content shell 51 . 
Thus, by the end of the video, all of the associated shortcut slides are displayed. This 
allows the user to easily jump back to critical points in the video 49 by simply selecting 
the appropriate slide. 

Some embodiments of the present invention present to the user a multimedia 
25 "floating steps" presentation. Such a presentation teaches the user a procedure, which 
is made up of a series of steps. For each step in the procedure, a shortcut slide is 
created. Thus, the user can choose to play the video from start to finish to learn the 
procedure in its entirety. Then, the user can repeat the instructions for certain steps by 
selecting the appropriate slide. This causes the video to jump to the proper point for 
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that step. Alternatively, if the user already has some knowledge about the procedure at 
hand, the user can decide not to play the video directly, but rather to simply jump to the 
portions of the video explaining the unfamiliar steps by selecting the appropriate slides. 

Fig. 5 illustrates another version of authoring a presentation. In Fig. 5, the 
5 flowchart represents multimedia content editing of a content shell where the shortcut 
boxes in the content shell link to other tangential content In the present embodiment, a 
pre-existing video is first input 100 into the content generation application 14 during 
content editingThe video begins to play 103 and the author may, at any time, use video 
controls 1 13 to fast forward, reverse, pause, stop, play, or play in slow motion the video. 

16 At any desired point in the video, the author may extract a playback point P1 

ill 

q from the video 104. The playback of the video during content authoring is then paused 

t 105 and a shortcut box / slide in the content shell is linked 106 to the playback point P1 . 

C| In one embodiment, linking a shortcut slide to the playback point P1 will cause during 

[a video playback in the content shell, an event will occur in the shortcut slide whenever 

f5 the video reaches the playback point P1 . A specific event is then chosen 1 14 for the 

jjs* h 

H shortcut box. The author may choose from a variety of event paths that will execute at 
the point P1 during video playback in the content shell. Exemplary event paths may 

O include, but are not limited to, the appearance of the still image of the video 119 taken 
at P1 , the appearance of a predetermined image 1 18, an interactive text box 117, 

20 another video 1 1 6, or an audio program 1 1 5 standing alone or in combination with any 
other event path or a web browser. For example, as illustrated in Fig. 6, if the event 
path chosen is the still image of the video 119, then during playback of the video, the 
still shot of the video taken at playback point P1 during content authoring will appear in 
the shortcut box at point P1 during playback in the content shell. 

25 The activation of the shortcut box may then be linked with another event 120, 

such as a predetermined video 121 or other form of tangential content. In such a 

situation, while viewing the presentation, if the viewer activates the shortcut box 151 by 

clicking on it, or by some other input method, the predetermined video 121 begins to 

play in the content shell. In some embodiments, the tangential content displays in a 
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region reserved for such content 44 so that the main window 47 can continue to display 
the primary video. A user may link the activation of the shortcut box 120 with a variety 
of events, such as, but not limited to, activating an interactive program 125, a web 
browser 122 which may be embedded in the content shell, an interactive text box 123, 
5 or an audio program 124 alone or in conjunction with one of the other event paths. 

The video authoring tool not only associates tangential content with the slides, 
but it also assists the user in the creation of content. For example, in one embodiment, 
the authoring tool includes an HTML editor that the user can use to create HTML 
tangential content, while in another embodiment, the authoring tool may include a 
10 function that defines a quiz, allowing the user to associate a quiz to the content. 

Once the author is finished creating shortcuts 126, the video editing ends 127. 
h Otherwise, the playback resumes 111,113. Once again, the end of editing results in a 
a presentation script being generated that can later be used to run the presentation. 

m Script-Based Multimedia Presentations 

15 As discussed above, Figures 4 and 5 illustrate methods for authoring multimedia 

content presentations with tangential content. These presentations can later be shown 

P to a user with a presentation player. The presentations built through the methods of 
Figures 4 and 5 are based upon a presentation script generated during the authoring 
process. Thus, the elements of the presentations are: the video file, images or text for 

20 the short cut slides, tangential content associated with the shortcut slides, and the 
presentation script which describes the relationships among the video, slides, and 
tangential content. Once created, a multimedia player can play the presentation to a 
user by parsing the presentation script. 

In one preferred embodiment, the video authoring tool and the video player are 
25 written in Java (or other similar language) to allow the tools to be platform independent. 
The video presentation script is generated in VXML ("Video Extensible Markup 
Language), a markup language that is compliant with XML. As is well known in the art, 
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Extensible Markup Language, otherwise known as XML, is a universal format for 
structured documents and data on the Web. XML allows for the creation of textual data 
files delimited by markup tags. XML is a World Wide Web Consortium ("W3C") 
standard and information on it can be found on W3C's website. As XML is - by its very 
5 name - extensible, various parties have created implementations of the language for 
specific purposes. For example, there is VoiceXML, IXML, and FXML, to name just a 
few. VoiceXML (also known as VXML, but not to be confused with the VXML of the 
present invention) is the speech interface framework that extends the Web to voice- 
based devices. The International Consortium for Alternative Academic Publication 
10 ("ICAAP") has developed IXML, which promises to provide sophisticated indexing and 
document handling capabilities at a very low cost. And OANDA Corporation supports 
O FXML as a simple, automatic way of retrieving currency exchange rates over the 
n Internet. The present invention includes VXML as yet another useful extension of XML. 

In the present invention, the video player parses the video presentation VXML 
IS script and reacts accordingly to properly present the video, the shortcut slides, and the 
J , tangential content associated with the shortcut slides. In the preferred embodiment, any 

new type of tangential content type can be handled by a "plugin"-type system where the 
y code to display and to execute the new content type is dynamically attached to the 
g presentation player program. With this method, the presentation player can be 
20 extended to handle an infinite number of different tangential content types. One of 

ordinary skill in the art is familiar with such "plugin"-type architecture. 

Although scripting languages are well known in the art, VXML is more robust that 
other similar languages. For example, VXML supports Boolean testing, branching, 
variable storage, resource allocation, subroutines, and the like. 

25 As in XML, VXML is a series of tags that can be used to categorize and define a 

set of data. At the highest level, every VXML script has a similar structure. All such 

scripts begin with the <VXML> tag followed by all resource-type tags within a 

RESOURCES section followed by one or more SEQUENCE sections composed of 

frame-type tags. Each of the sequence sections is parsed by the frame player module of 
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the video player when called, beginning with the sequence section labeled as "main" so 
that the presentation can be correctly played for the user. 

In a preferred embodiment, VXML is made up of three primary types of tags: 
Section tags, Resource tags, and Sequence tags. Each type supports a set of tags. A 
preferred embodiment of the syntax for the various tags will now be described. 

(1) Section Tags 

Section tags provide a framework for organizing the VXML script as a whole. In 
one embodiment, there are three Section tags: (i) <VXML>, (ii) <RESOURCES>, and 
(Hi) <SEQUENCE>. 

(a) <VXML> SECTION TAG 

The <VXML> tag is the first tag in every VXML presentation script file and it 
encapsulates all of the remainder of the script file. Thus, a VXML script file can be 
placed within another XML or HTML file. If that file does not recognize the VXML tag, 
the entire VXML portion will be ignored. The syntax of this tag is: 

<VXML: w=width_value h=height_value> 

where the width_value value determines the width of the video player and the 
height_value value determines the height of the video player. 

(b) <RESOURCES> SECTION TAG 

The <RESOURCES> tag delineates the resources section of the VXML file. 
Each resource, such as a video, is given a name and may be otherwise described. The 
<RESOURCES> tag must be supplied in every VXML file and it must appear as the first 
tagafterthe<VXML>tag. 

(c) <SEQUENCE> SECTION TAG 
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Following the <RESOURCES> tag are one or more <SEQUENCE> tags. Each 
<SEQUENCE> tag groups together a collection of <FRAME> tags. The resulting 
sequence of Frames may be "played" by a Call Frame. The first of the <SEQUENCE> 
tags must be named "main" and it will execute first. The Call Frame may optionally set 
values for variable data values, which may be referenced by the Frames of the 
sequence. 

For example the <SEQUENCE> tag may be in the form of: 

< SEQUENCE: name=my_sequence, parmO=time_to_start, 
p a rml = t ime_t o_end> 

<IMAGE: time=$time_to_start, 

url=f ile: ///c: \images\face . jpg, region=my_region> 
</IMAGE> 

<WAIT: time=$time_to_end> </WAIT> 
</SEQUENCE> 

In this example, a sequence named "my_sequence" is created. Two parameters are 
created via variables named time_to_start and time_to_end. The Image Frame and the 
Wait Frame each reference one of these variables. The actual values of the variables 
will be determined at run-time by a Call Frame. 

(2) Resource Tags 

Resource tags define time-less resources that are global in nature, such as tools, 
resources, or structures that may be accessed by various and multiple sequence tags. 
Currently there are four supported Resource tags: <REGION>, <VIDEO>, <STYLE> 
and <VARIABLE>. 

(a) <REGION> RESOURCE TAG 

The <REGION> tag is used to define an area of the screen. This tag defines a 
region of the screen that may be referenced by various Frame Tags. Each region is 
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given a name, position, and dimensions. There may be many Region Tag's, and the 
regions that they specify may overlap. The syntax for this tag is: 

<REGION : name=regiorwiame , 

x=x_value, y=y_value, w=width_value, 
5 h=height_value> 

</REGION> 

where: regionjiame is any author-selected name for the region; 

X_value indicates the x coordinate of the upper left corner of the named region; 

Y_value indicates the y coordinate of the upper left corner of the named region; 
tt Width_value indicates the width of the named region; and 
O Height_value indicates the height of the named region. 

(b) <VIDEO> RESOURCE TAG 

a 

The <VIDEO> tag is used to define a video that may be by referenced by one or 
H more Video Frame Tag's. Each video is given a name, and assigned to a named 
|a( region. Videos will not be displayed until an appropriate Video Frame is executed. The 
M syntax of this tag is: 

<VI DEO : name=video_name , url=locator_value , 
region=region_name> 

</VIDEO> 

20 where: video_name is any author-selected name for the video; 
locator_va!ue is the url of the file containing the video data; and 
regionjiame is any named region defined by a Region Tag. 
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(c) <STYLE> RESOURCE TAG 

The <STYLE> tag defines the format, color, and styles used by various Frame 
tags. There is a pre-defined Style Resource, with all default parameter values. All 
Frames whose style parameters are null use the default Style Resource. The syntax for 
this tag is: 

<STYLE : name=style_name, 

bcolor=background_color_value, 
f color=f oreground_color__value, 

f ont_name=f ont_name__value , 
f ont_style=font__style_value, 

f ont_s i z e= f ont_s i z e_va 1 ue , 

border=border_f lag, 
border_title=border_title_f lag, 

border_color=border_color_value, 

align=alignment_value> 

</STYLE> 

where: Stylejiame is any author-selected name for the style; 

Background_color_yalue, foreground__color_value and border_colorj/alue are any of 
the following: black, blue, cyan, dark gray, gray, green, light gray, magenta, orange, 
pink, red, white, yellow; 

font_namej/alue is the name of any installed font; 
Font_style__value is any of the following: bold, italic, plain or regular; 
font_size_value is any size that is valid for the specified font; 
Borderjlag is either 0 or 1, to indicate the absence or presence of a border; 



14 



File Number 20980-20 



Borderjitlejlag is either 0 or 1 , to indicate the absence or presence of a border title; 
and 

AIignment_value is one of the following: center, left, right, trailing, or leading. 

(d) <VARIABLE> RESOURCE TAG 

5 The Variable Resource provides a global mechanism for passing and storing 

data. Variable Resource values can be set via a SET Frame, and interrogated via an IF 
Frame. Clearing a Variable Resource via a CLEAR Frame sets it value back to the 
specified initial value. 

Variable Resources should not be confused with Sequence Variables, which may 
|l) be passed to Sequences via the Call Frame. The values of Sequence Variables are 
12 only valid within the Sequence in which they are defined. In particular, except via a Call 
«I Frame, there is no way for a Frame in one Sequence to set the value of a Sequence 
Variable for a Frame in a different Sequence; Variable Resources, however, are global 

iu 

in nature. Once a Variable Resource is set via a Set Frame, its value can be checked by 
|S If or Until Frames in any Sequence. 

§ : s it 

The syntax for a <VARIABLE> tag is: 

O 

<VARIABLE : name=variable_naine , initial=initial_value> 
</VARIABLE> 

where: Variabie_name is any author-selected name for the variable; and 

20 lnitial_value is any alphanumeric value, which will be the default value assigned to the 
variable. 

(3) Sequence Tags, also known as Frame Tags 

Sequence tags within a VXML script define the actions and events that comprise 
a presentation. Each sequence tag defines an action or presentation element and the 

25 time at which it is to occur. Sequence tags are also called "Frame Tags" or "Frames" 
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since frames in a movie similarly define presentation elements (i.e., the images) that 
occur at specific times. 

Currently, there are fourteen supported Frame tags in VXML, from the VIDEO tag 
which executes the specified command (such as "play") on the video, to the IF tag 
5 which performs a conditional test on a variable resource. 

(a) <VIDEO> FRAME TAG 

The VIDEO Frame tag executes the specified command on the specified video. If 
the command is play, then video playback begins at the specified media-time. If the 
command is stop, then the video is stopped, and the still frame associated with the 
5P specified media-time is displayed. The video is displayed in the screen region that was 
Q specified in the associated Video Resource. Syntax for this tag is: 

?*i <VIDEO: time=$time_value, video=$video_name, 

£1 cmd=$video_cmd, media_time=$position_value> 

</VIDEO> 

5 

|| where: time_value is the time, in milliseconds, at which the Video Frame will be 
executed; 

111 

p videojiame is the name of a video (should match a name that was specified in a Video 
Resource); 

video_cmd is either play or stop; and 

20 Position_value specifies the media time for the specified video command (a media time 
of minus one indicates that the video should start/stop wherever it is); 

(b) <IMAGE> FRAME TAG 

The IMAGE Frame displays the specified image in the specified region. The x, y, 
w, and h parameters facilitate the use of a "cropped" file. That is, the specified image file 
25 may contain a large image; but only the rectangle specified by the parameters will be 
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displayed. If negative values are specified for the width or height parameters, then the 
entire width or height of the image is used. The general syntax for this tag is: 

< IMAGE: time=$time_value, url=locator_value, 

region=$region_name A x=x_value, y=y_value, 
w=width_value. h=height_value > 

</IMAGE> 

where: time_value is the time, in milliseconds, at which the Image Frame will be 
executed; 

locator_value is the url of the file containing the image data, in JPEG format; 

region_name is the name of a region (should match a name that was specified in a 
Region Resource); 

x_value indicates the x coordinate of the upper left corner of the image within the image 
file; 

y_value indicates the y coordinate of the upper left corner of the image within the image 
file; 

width_value indicates the width of the image within the image file; and 

height_value indicates the height of the image within the image file. 

(c) <SCRIPT> FRAME TAG 

The SCRIPT tag displays the specified html data in the specified region. If the 
URL is null, then the in-line content is displayed. Otherwise, the content is taken from 
the specified URL. The syntax for this tag is: 

<SCRIPT: time=$time_value, url=$locator_value, 

region=$region_name, style=$style_name> 
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optional html content 

</SCRIPT> 

where: time_value is the time, in milliseconds, at which the Script Frame will be 
executed; 

5 locator_value is the url of the file containing the script data, in html format; 

regionjiame is the name of a region (should match a name that was specified in a 
Region Resource); and 

stylejiame is the name of a style (should match a name that was specified in a Style 
| a L Resource). 

f b (d) <LABEL> FRAME TAG 

m The Label Frame displays the specified text as a label within the specified region. If the 
Ta url is not null, then an icon will also be displayed within the label. The syntax is: 

<LABEL: time=$ time__value, url=$locator__value, 
t 5 ** region=$region_name, style=$style_name, 

fi align=$text__to__icon_alignment > 

label_text 

</LABEL> 

where: time_value is the time, in milliseconds, at which the Label Frame will be 
executed; 

20 locator_value is the url of a file containing an image to be displayed as an icon within 
the label; 

region jiame is the name of a region (should match a name that was specified in a 
Region Resource); 

style__name is the name of a style (should match a name that was specified in a Style 
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Resource); and 

text_tojcon_alignment is one of: top, bottom, left, right, or center. 

(e) <CLEAR> FRAME TAG 

The Clear Frame causes the specified Resource to revert to a "clear state". The 
exact meaning of "clear state" is dependent on the particular Resource. For a Region 
Resource, the clear state is one in which the associated screen region is erased. For a 
Video Resource, the clear state is one in which the video is stopped and the associated 
screen region is erased. For a Style Resource, the clear state has no meaning. Syntax 
for this tag is: 

< CLEAR : time=$time_value, resource=$resource_name> </CLEAR> 

where: time_value is the time, in milliseconds, at which the Clear Frame will be 
executed; and 

resourcejiame is the name of a resource (should match a name that was specified in a 
Resource Tag). 

(f) <WAIT> FRAME TAG 

The Wait Frame performs no operation. Including a Wait Frame in a Sequence 
has the effect of causing the sequence to wait until time in the specified time parameter 
has passed. The syntax for the tag is: 

<WAIT: time=$time_value> </WAIT> 

(g) <SET> FRAME TAG 

The Set Frame sets the value of the specified Variable Resource. Once set, the 
value of the Variable Resource is available to IF and UNTIL Frames in all Sequences. 
The syntax for this tag is: 

<SET: time=$time_value, variable=$variable_name, 
value=$new_value> </SET> 
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where: time_value is the time, in milliseconds, at which the Set Frame will be executed; 
and 

Variable jiame is the name of a Variable (should match a name that was specified in a 
Variable Resource Tag). 

5 (h) <UNTIL> FRAME TAG 

The Until Frame stalls the currently executing Sequence until the specified 
condition is, or becomes true. Until that occurs, no subsequent Frames will be played. 
However, Asynchronous Frames, resulting from user interaction with Button Frames or 
Entry Frames may execute while the Sequence is stalled. Presumably in this manner 

\ts\t 

the specified condition will eventually be made to be true via a Set Frame, so that the 
J ;| Sequence may continue. The syntax for this tag is: 

fy <UNTIL: time=$time_value, variable=$variable_name, 

iy test=$condition, value=$test_value> 

a 

hi </UNTIL> 

15 where: time_value is the time, in milliseconds, at which the Until Frame will be 

m executed; 

&*£ 

variable_name is the name of a Variable; (should match a name that was specified in a 
Variable Resource Tag); 

Test is one of: equal, not_equal, greaterjhan, lessjhan, greater_than_or_equal, or 
20 less_than_or_equal; and 

value is an alphanumeric value to which the specified Variable Resource will be 
compared. 

(i) <SLIDER> FRAME TAG 
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The Slider Frame tag displays a "slide-control" in the specified Region. The slide- 
control can be used to change the time during a video presentation. That is, this control 
allows a user to fast-forward, rewind, or skip over sections of an presentation. The 
Slider Frame can only change the time within the Sequence in which it is found, (its 
5 "home" Sequence), and any Sequences invoked via Call Frames that are executed 
within its "home" Sequence. The syntax for this tag is: 

<SLIDER: time=$time_value, region=$region_name> </SLIDER> 

where: time_value is the time, in milliseconds, at which the Slider Frame will be 
executed; and 

|0 regionjiame is the name of a region. 

P 0) <CALL> FRAME TAG 

Hi 

% The Call Frame tag invokes the specified Sequence. After the Frames that 

W comprise the content of the called Sequence are played, execution continues with the 
Frame after the Call Frame. The syntax for this tag is: 

f5 <CALL: time=$time_value, sequence=$sequence_name, 

J*; parmO=$parmO_value, parml=$parml_value, 

parm2=$parm2_value , parm3=$parm3_value, 
parm4=$parm4_value, parm5=$parm5_value, 
parm6=$parm6_value, parm7=$parm7_value, 
20 parm8=$parm8__value , parm9=$parm9_value > 

</CALL> 

where: time_value is the time, in milliseconds, at which the Call Frame will be 
executed; 

sequence_name is the name of a Sequence (should match a name that was specified 
25 in a Sequence Section Tag); and 
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ParmX_value is a value to be assigned to the X'th Sequence Variable of the called 
Sequence. The Frames that comprise the content of the called Sequence may retrieve 
the value of the X'th Sequence Variable by the variable name associated with the X'th 
parameter in the Sequence Section Tag, prefixed by a dollar sign. 

5 (k) <BUTTON> FRAME TAG 

The BUTTON Frame, once it is executed, causes a specified region to become 
sensitive to mouse clicks. From this time on, if a mouse click occurs within the specified 
region, then the Button Content Frame is executed immediately. When the mouse click 
occurs (the activation time), the Button Content Frame will be played without regard to 

10 any time parameter, even if the video player is waiting on the condition of an Until 

P Frame. The syntax for this tag is: 

Q <BUTTON : time==$time_value, region=$region_name> 

p A single frame tag, (the Button Content Frame) 

r </BUTTON> 

15 where: time_value is the time, in milliseconds, at which the button tag will be executed, 
(i.e., the time at which the region will become sensitive to mouse-clicks); and 

H region_name is the name of a region (should match a name that was specified in a 
Region Resource). 

(I) <ENTRY> FRAME TAG 

20 The Entry Frame, once it is executed, allows the viewer of the video presentation 

to enter data. In order to begin entering data the user need only place the mouse cursor 
within the designated region, and begin typing. Upon pressing the enter-key any 
entered text is transferred to a special variable. When the enter-key is pressed (the 
activation time), the Entry Content Frame will be played without regard to any time 

25 parameter, even if the video player is waiting on the condition of an Until Frame. The 
syntax for this tag is: 
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<ENTRY: time=time_value, region=region_name, 
style=style_name, name=text_yariable_name> 

A single frame tag (the Entry Content Frame) 

</ENTRY> 

5 where: timej/alue is the time, in milliseconds, at which the Entry Frame will be 
executed; 

regionjiame is the name of a region (should match a name that was specified in a 
Region Resource); 

stylejiame is the name of a style (should match a name that was specified in a Style 
|0 Resource); and 

Text_variable_name specifies the name of a variable to hold the entered text. 

pj (m) <IF> FRAME TAG 

m 

The If Frame performs a test on the specified Variable Resource. Depending on 
H the outcome of the test, the If Frame either executes, or does not execute the If Content 
45 Frame. If the Resource Variable "passes" the specified test then the If Content Frame 
f I will be executed immediately, and before the next Frame in the Sequence. If the test 

passes, then the If Content Frame will be played without regard to any time parameter. 

The syntax for this tag is: 

<IF: time=time_value, variable=$variable_name, 
20 test=$condition, value=$test_value> 

A single frame tag (the If Content Frame) 

</IF> 

where: time_value is the time, in milliseconds, at which the Entry Frame will be 
executed; 

25 variablejiame is the name of a Variable (should match a name that was specified in a 
Variable Resource Tag); 

23 



File Number 20980-20 



Test is one of: equal, not_equal, greaterjhan, lessjhan, greater Jhan_or_equal, or 
lessjhan_pr_equal; and 

value is an alphanumeric value to which the specified Variable Resource will be 
compared, 

(n) <NEW> FRAME TAG 

The NEW tag, once it is executed, causes the video player to load and play a 
new presentation for the user. The syntax of this tag is: 

<NEW: time=$time__value, url=$locator_value> </NEW> 

where: time_value is the time, in milliseconds, at which the NEW tag will be executed; 
and 

locator_value is the url of a new (compiled) presentation file for the video player to play. 
Operation of the Video Player 

As has been previously discussed, Figures 4 and 5 demonstrate the method of 
video authoring in which a video presentation is associated to a series of slides, each 
slide displaying tangential content, allowing the user to jump to a location within the 
video, or allowing to user to access tangential content. The end result of video 
authoring is a presentation script. In the preferred embodiment, the presentation script 
is a VXML script, although other formats could also be used. 

When the video player application begins execution, it parses a VXML 
presentation script that describes the relationships among the video, the slides, and the 
tangential content By way of example, in one embodiment, a system using the 
methods shown in Figures 4 and 5 can create the following presentation script file that 
will cause the user interface illustrated by Figure 2 to be presented to the user: 
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<VXML> 
<RESOURCES> 

5 < REG I ON: name="slide_top_lef t " x="220" y="10" width="100" 

height="100" > 
</REGION> 

< REGION: name="slide _top_right" x="330" y="10" width="100" 
10 height="100" > 
</REGION> 

< REGION: name="slide_right_upper" x="530" y="130" width="100" 
height= M 100" > 
15 </REGION> 

< REG I ON: name="slide_right_JLower " x="530" y="250" width="100" 
height ="100" > 
</REGION> 

m 

Ci < REGION: name=" slide JLef t_upper" x="10" y="130" width="100" 

ft height="100" > 
ill </REGION> 

13 < REG I ON: name="slide_lef t_lower" x="10" y="250" width="100" 

JJf height="100" > 
O </REGION> 

s < REGION: name="movie^rect " x="160" y="120" width="320" 

|Q height="240" > 
U </REGION> 

V] < REGION: name="play_butt_rect " x="260" y="360" width="50" 

«? height-" 50" > 
§§ </REGION> 

<REGION: name="pause_butt_rect " x="330" y="360" width="50" 
height="50" > 

</REGION> 

40 

<VIDEO: naine="main_movie" url="http: //13i . com/main. mov" 
region="movie_rect" > 
</VIDEO> 

45 <VIDEO: name="sub_movie" url="http : //13i . com/ tangent iall .mov" 

region="movie_rect" > 
</VIDEO> 

<VARIABLE: name="slide_done" initial="0" > 
50 </VARIABLE> 

</RESOURCES> 



55 <SEQUENCE name="main" > 
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< IMAGE : time="0" url="playjDUtt . jpg" region="play_butt_rect" > 
<BUTTON: time="0" region- "play_butt_rect "> 

<VIDEO: name="rnain_movie" cmd="play" > 

</VIDEO> 
5 </BUTTON> 

<IMAGE: time="0" url="pause_butt . jpg" region="pause_jDutt_rect " > 
<BUTTON: time="0" region="pause_butt_rect"> 

<VIDEO: name="main_movie" cmd="stop" > 
10 < /VI DEO 

</BUTT0N> 



<VIDE0: name="mainjriovie" time="0" cmd="play" 
15 media_time="0" > 
< /VIDEO 

< IMAGE: time="5000" url="slidel .jpg" region-" slide J: opJLeft" > 
</IMAGE> 

W <BUTT0N: time-"5000" region="slide_top__lef t"> 

p-% <CALL: sequence="show_html__text" 

III text__file_name="slidel.html" > 

* r ; </CALL> 

I! </BUTT0N> 

Kl < IMAGE: time="8000" url="slide2 .jpg" region="slide_jtop_right" > 

0 </IMAGE> 

fjj <BUTT0N: time="8000" region-" slide__top_right"> 

s <CALL: sequence="show_sub_movie" movie_name=="sub_movie" > 

M </CALL> 

1 ; </BUTTON> 

<IMAGE: tirae="11000" url="slide3 , jpg" region="slide__right_upper " > 
W </IMAGE> 

3® <BUTT0N: time="11000" region-" slide_right_upper"> 

H <CALL: sequence="show_image" image_f ile_name="image_abc. jpg"> 

</CALL> 
</BUTT0N> 

40 < I MAGE: time="14000" url="slide4 . jpg" region="slide_right_lower" > 

</IMAGE> 

<BUTT0N: time= n 14 001" region-" slide_right_lower"> 

<CALL: sequence-" show_sub_movie" movie__name="sub_movie" > 
</CALL> 
45 </BUTT0N> 

< IMAGE : time-"17000" url-"slide5 . jpg" region-"slide_lef t_lower" > 
</IMAGE> 

<BUTTON: time-"17000" region="slide_lef t_lower"> 
50 <CALL: sequence="show_html__text" 

text_f ile_name="slide5 » html" > 
</CALL> 
</BUTT0N> 

55 <IMAGE: tirue-"20000" url="slide5 . jpg" region-"slide_lef t_upper" > 
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</IMAGE> 

<BUTTON: time="20000" region="slide_lef t_upper"> 

<CALL: sequence^ "show_image" image_f ile_name="irnage_abc . jpg"> 

</CALL> 
</BUTTON> 



</SEQUENCE> 

< SEQUENCE: name="show_html_text" parmO= n text_f ile_name" > 
<SET: variable="slide__done" value="0"> 
</SET> 

<VIDEO: name="main__movie" cind="stop" > 
< /VI DEO 

<VIDE0: name="main_movie" cmd="hide" > 
< /VI DEO 

<SCRIPT: time="5001" region="movie_rect " url=$text_f ile_name> 
</SCRIPT> 

<BUTT0N: region="movie_rect " > 

<SET: variable="slide_done" value="l" > 
</SET> 

</BUTT0N> 

<UNTIL: variable="slide_done H test-"equal ,T value="l"> 
</0NTIL> 

<VIDE0: name="mainjiiovie" cmd= n show" > 
</VIDE0> 

<VIDE0: name="main_movie n cmd="play" > 
</VIDE0> 

</SEQUENCE> 

<SEQUENCE: name="show_sub_movie" parmO="movie__name" > 
<SET: variable="slide_done" value="0"> 
</SET> 

<VIDE0: name="main_movie" cmd="stop" > 

</VIDE0> 

<VIDE0: name^"main_movie" cmd="hide" > 
< /VI DEO 

<VIDE0: riame=$movie_name cmd= TT play" > 
< /VI DEO 

<BUTT0N: region="movie_rect " > 

<SET: variable="slide_done" value="l" > 
</SET> 

</BUTT0N> 

<UNTIL: variable="slide_done" test="equal" value= f, l"> 
</UNTIL> 

<VIDE0: name=$movie_name cmd="stop" > 
< /VI DEO 

<VIDE0: name="main movie" cmd="show" > 
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</VIDEO> 

<VIDEO: name="main_movie" cmd= n play" > 
< /VIDEO 

5 </SEQUENCE> 

<SEQUENCE: name== n show_image n parmO="image_f ile__name" > 
<SET: variable="slide_done" value="0"> 
</SET> 

10 <VIDE0: name="mainjnovie" cmd= n stop" > 

</VIDE0> 

<VIDE0: name="main_movie" cmd="hide" > 
</VIDE0> 

15 <IMAGE: name=$image_f ile_name region="movie_rect " > 

</IMAGE> 

<BUTT0N: region="movie_rect" > 

<SET: variable="slide_done" value="l" > 
20 </SET> 
H </BUTT0N> 



<UNTIL: variable="slide_done" test="equal" value="l rT > 
</UNTIL> 

<VTDE0: name="main_movie" cmd="show" > 
< /VI DEO 



m <VIDE0: name="main_movie" cmd="play" > 

30 </VIDE0> 
h"* </SEQUENCE> 

f : "' K 

U </VXML> 

ul 

33 The script above defines a presentation containing a video in the center 635, six 

u 

slides (605 - 630) that appear at predetermined times around the perimeter of the video 
it plays, and instructions specifying what to do when those slides are selected. 

First the resources are added. Several regions that comprise the slide positions 
are added, they are descriptively named ; "topjeft" 605, "topjrighf 610, "rightjjpper" 

40 615, "right Jower 620, "ieftjower" 625, M left_upper" 630. Then the movie region 635 is 
added followed by a region for one "subjmovie" (also 635) and two buttons (640 and 
645). The movie region 635 is where the main presentation movie will play, the sub 
movie region (also 635) is where a "tangential" movie will play, and the two button 
regions (640 and 645) are where the user will click to control the playback of the movie 

45 by starting and stopping it. 
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After the resources, the main sequence follows. First it displays the two movie 
control button images ("piayjwtt.jpg" and "pause_butt.jpg") and declares the two 
"button" frames enabling the "play" and "pause" movie control. It then tells the main 
movie to start playing with a "VIDEO" frame. After the video plays for 5 seconds (5000 
5 miliseconds), the first slide is displayed and its associated button becomes active (by 
the declaration of the button frame). Notice in the button frame, the instruction calls a 
sequence called "showjitmljiext". This means that if a user clicks on the button, then 
that sequence will be called and the instructions in that sequence will be executed. The 
"showJitml_text" sequence will pause the main video, hide it, then show some HTML 
10 text in the main movie window. (The HTML text file name is a parameter to the 

sequence and in this case it is a file called "SLIDE1 .HTML"). It will then wait until the 
K user clicks in that window and then return to the main movie. Upon the return to the 

main movie, the movie continues and the other slides appear. Slides 2 through 6 
p appear in the same fashion at 8, 1 1 , 14, 17 and 20 seconds, respectively, into the main 
IS movie. Although the slides appear in the same fashion, they have different behaviors 
UJ when clicked on. Clicking on Slide 2 will call a sequence ("show_sub_movie") that 
y, pauses and hides the main movie, then shows a different movie. Slide 3 will call a 

s. . 

sequence ("showjmage") that pauses and hides the main movie, then shows an 
iy image. 

ill 

id One skilled in the art will readily see that the presentation script allows the 

multimedia presentation (with its primary video and tangential content) to be easily built 
and later modified. By simple changes to the presentation script, different content can 
be incorporated into the presentation. Also, by simple changes to the presentation 
script, different topologies can be defined. For example, the video window can appear 

25 in the corner with the slides appearing down the side. The way the script is defined 
enables the linking of any "tangential" content to media playback. In a preferred 
embodiment, if a new type of content is defined, the player can be extended and a new 
VXML tag can be added. Also in a preferred embodiment, the player will be built to 
extend itself to handle a previously undefined tag by accessing a module of code that 

30 can dynamically attach to the main body of the player code using a "plug in" system. 

29 



File Number 20980-20 



In another variation, the slide regions could correspond to steps within an activity. 
The slide regions could then cause the video in the video window region 1 130 to jump 
to the portion of the video describing the first step if the top left region slide 1 1 1 0 is 
selected, and jump to the portion of the video describing the second step (if the top right 
5 region slide 1 120 is selected. As yet another variation, rather than displaying a graphic 
image retrieved from a URL, the slide regions 1110 and 1 120 can be configured to 
display any other type of tangential content, such as text, secondary videos, access to 
websites, audio clips, etc. While the current script gives the user the ability to stop and 
start the playing of the video by selecting the buttons, in other variations, the video 
10 player can present to the user a control panel with controls such as fast-forward, pause, 
and rewind. 

Architecture of the Video Player 

h Fig. 7 is a block diagram illustrating how the presentation engine relies on script 

data files to provide the video player to the end user. In Fig. 7, the video player 

VS. software 300 resides as a computer application on the end user's computer, on a server 

J s , of a network, as an web applet, or the like. The video player software 300 parses 

\* scripts 310 to display the video and associated tangential content to the end user. The 

Uj content 325 may be already loaded on the computer or may be available over the 

H Internet or other network. The code blocks 320 are linked to the video player software 

20 300 so that new functionality can be easily added. 

While the specification describes particular embodiments of the present 
invention, those of ordinary skill can devise variations of the present invention without 
departing from the inventive concept. Although the scripts as explained herein are 
simplified for conceptual demonstration, one of ordinary skill in the art can easily use 
25 this disclosed information to create more complex scripts. 
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